ABSTRACT

Over the past several years benchmarking has been developed into an effective technique for performance analyses of computer systems. Relational database machines are relatively new computer systems for which a benchmarking technique does not yet exist.

The benchmarking of relational database machines involves the identification and design of test programs through which relevant performance data can be gathered and interpreted. All features of relational database management must be considered when designing these test programs. The join operations are an important feature of relational database management.

The test programs for the join operations necessarily include the repetition of certain queries during which specific join parameters are varied. These parameters include: tuple size, relation size, disk placement, and the use of indices. A number of join operations have been benchmarked. These operations are equality joins, inequality joins, three-way joins, and virtual joins (i.e., views). A number of relational database machine configurations have also been utilized for benchmarking the join operations.

The highlights of the thesis can be found in its contribution of a benchmarking technique for the join operations and its conclusions on the performance analyses of various relational machines in operating joins.
A. WHAT IS BENCHMARKING?

The term "benchmark" has its origin in the field of geographical surveying. A benchmark is a permanent geographic feature which serves as a landmark for surveying. The term has evolved into defining a standard or criterion for a particular type of system or product. This standard serves as a point of reference to which functionally similar systems or products can be compared.

In the realm of computer science a benchmark consists of a set of instructions or programs. The execution of the set on one system provides measurements that can be used to compare with measurements obtained by running the same set on another system. This is the essence of computer system benchmarking: the act of conducting controlled experiments to collect indicators of comparative performances of different computer systems.


B. THE "GIBSON MIX"

Comparisons of computer systems were prompted by the increasing application of the systems in business and other situations in a cost-effective way. This interest in comparative performance of systems resulted in the controlled testing of the systems. In 1970, J.C. Gibson introduced a system of program sets or "mixes" by which different types of systems could be compared. The "Gibson Mix" approach to comparing systems is based on testing several sets of applications in both business and science. The results of these tests, in the form of execution times, were published. The problem of selecting a particular computer system could be reduced to establishing workloads as multiples of the mix. By properly balancing execution times and mix multiples, system evaluators could produce comparative estimates for total computer systems.
C. BENCHMARK DESIGN AND OBJECTIVES

Benchmarking as a technique for comparisons of computer performance has enjoyed increasing popularity over the past decade. This approach is appealing both to producers and consumers. Basic guidelines have been developed for the proper use of benchmarks. The benchmark must be representative of real-world workloads, and the mix of instructions should be inclusive enough to provide as much relevant data as possible. Additionally, the relevance of benchmark content must be justifiable. The benchmark should be carefully designed, and objectives should be specifically stated so that the proper sequence of steps in the benchmark progression can be set down. Objectives may include evaluation towards procurement, design analysis, component certification, quality determinations, load analysis, improvement of performance, or other objectives. The benchmark should be tailored to the objective and deal with those demands or applications requiring the benchmark. The objectives which initially formed the basis of and the requirement for the benchmarking must be controlled from design through implementation and throughout the interpretation of the results.




The experiments described in this paper have been conducted on several configurations of an RDM 1100 at the Data Processing Service Center West, Naval Air Station, Point Mugu, California. The RDM 1100 and its various configurations are relational database machines, each of which is designed to be the backend of UNIVAC 1100 series computers.


A. THE HOST COMPUTER

The host computer system for which a relational database machine is used as the backend is the UNIVAC 1100/42. No modifications have been required of the UNIVAC operating system. Specially designed host-resident software has been installed in the UNIVAC.


B. THE HOST COMPUTER/DATABASE MACHINE INTERFACE

Figure 2.1 depicts the presently available methods for interfacing between the host and the backend. The first method, the relational query language, is a command interface. The second method allows the user to execute a series of queries by referring to a set of stored commands. The third method is via user programs written in high-level programming languages such as COBOL and FORTRAN in which a subroutine is provided for accessing data stored in the backend machine.

In the RDM, host interfacing is accomplished by both parallel and serial interface modules (processors) (see Figure 2.2). Each interface module can support up to 8 host systems.

Figure 2.1 The Host/Backend Interface.
C. THE BENCHMARKED RELATIONAL DATABASE MACHINE

The basic relational database machine on which the benchmarking experiments have been conducted is a modularly designed, microprocessor-based database computer. Its modules are organized around a single high-speed bus (see Figure 2.2 again).
1. Technology and Functionality of Modules

a. The Database Processor

This Z8000 series microprocessor manages the flow of data by translating user queries into [...]. Additionally, this processor supervises system [...], coordinates hardware monitoring, and performs bus arbitration. The processor's code is approximately 99% C code, and the processor operates at 1/2 MIPS. If the database accelerator (described below) is available, the database processor senses its availability and issues calls for its services.
b. The Accelerator

This high-speed, auxiliary processor, which executes instructions at 10 MIPS, is built from ECL logic. It has a three-stage pipeline and is designed to optimize a well-defined collection of often used database management subroutines. The accelerator can filter data at disk transfer rates.
c. The Cache

This main memory is composed of 64K dynamic RAM chips and is expandable up to 6 megabytes. System [...] and code occupy approximately 360K of this memory. Cache is allocated in 2K blocks, contiguously whenever possible. The paging algorithm is basically Least-Recently-Used, and the system code is never paged out.


d. Disk Drives and the Secondary Storage

The disk controller module performs burst error detection and correction and retry without intervention by the database processor. This controller can manage from one to four disk drives with each drive having a capacity of one to four disks. Presently there are two disks available with each disk capable of storing approximately 600 megabytes of data.


2. Different Accelerator and Cache Configurations

The benchmarking experiments have been conducted on the following different machine configurations:

a. 1/2-megabyte cache without the database accelerator
b. 2-megabyte cache with the database accelerator
c. 2-megabyte cache without the database accelerator


D. THE DATABASES

The relational database machine handles data in 2K byte blocks. With this in mind a synthesized database has been designed. Tuple (record) lengths of 100 bytes, 200 bytes, 1000 bytes, and 2000 bytes have been chosen, thereby providing a range of 1 to 20 tuples per block. We have sought through experimentation to contrast the same operations performed on relations with different numbers of tuples per block. It is felt that this approach may provide some measure of processor-overhead time versus I/O time.
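The stated range of 1 to 20 tuples per block follows from simple integer division. The sketch below assumes that "2K byte" means 2048 bytes and that a block carries no header or slot overhead; neither assumption is confirmed by the text, and the name tuples_per_block is illustrative only.

```python
# Assumption: a "2K byte" block is 2048 bytes with no per-block overhead.
BLOCK_SIZE = 2048


def tuples_per_block(tuple_length: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of whole tuples of the given length that fit in one block."""
    return block_size // tuple_length


if __name__ == "__main__":
    # The four tuple lengths chosen for the synthesized database.
    for length in (100, 200, 1000, 2000):
        print(length, tuples_per_block(length))
```

Under these assumptions the four tuple lengths give 20, 10, 2, and 1 tuples per block respectively, which matches the range quoted above.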
1. Database Generation

Standard templates for each of the four different tuple lengths have been designed. Table I describes these templates. Note that within each template there are attributes (fields) that are common to all four templates: sequential integers and random integers. The attributes of sequential and random integers can be used to enforce different orderings of the same data. Each template also contains attributes specified with values uniformly distributed over a number of enumerated values. By ensuring a specified distribution, the reliability of equality joins can be assured.

The actual relations for the experimental databases have been generated on an IBM 3033 system in batch mode and have been transferred to tape for transport to the UNIVAC system. For each of the four tuple lengths, relations have been generated with 500, 1000, 2500, 5000, and 10000 tuples.
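A minimal sketch of such a generator is given below. It is not the batch program that was run on the IBM 3033. The field names key, rand, and bucket are hypothetical, the random field is assumed to be a permutation of the key values, and the enumerated field is assumed to cycle through its values so that each occurs equally often.

```python
import random


def generate_relation(n_tuples: int, n_enumerated: int = 10, seed: int = 0):
    """Build n_tuples rows with a sequential field, a random field, and a
    field spread uniformly over n_enumerated values (all names hypothetical)."""
    rng = random.Random(seed)
    rand_values = list(range(1, n_tuples + 1))
    rng.shuffle(rand_values)  # same integers as the key, in random order
    rows = []
    for i in range(n_tuples):
        rows.append({
            "key": i + 1,                # sequential integer
            "rand": rand_values[i],      # random ordering of the same integers
            "bucket": i % n_enumerated,  # uniform when n_enumerated divides n_tuples
        })
    return rows


if __name__ == "__main__":
    # Relation sizes taken from the text.
    for size in (500, 1000, 2500, 5000, 10000):
        print(size, len(generate_relation(size)))
```

Sorting the rows on key or on rand gives two different orderings of the same data, which is the use of the sequential and random attributes described above.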


2. Database Creation, Loading, and Disk Placement

In the environment of the database machine, the number of 2K-byte blocks assigned to a database is specified with the CREATE DATABASE command in the query language (Section II.E further describes the query language). Since database allocations are made in whole numbers of cylinders, the number of blocks specified will be rounded up to the next whole number of cylinders. Once the allocation is made, the number of blocks actually allocated is returned to the user. The syntax for database creation in the query language is:

CREATE DATABASE (name) WITH (options)


TABLE I
Standard Tuple Templates

100-BYTE RELATION    200-BYTE RELATION    1000-BYTE RELATION   2000-BYTE RELATION
FIELD      TYPE      FIELD      TYPE      FIELD      TYPE      FIELD      TYPE
KEY        I4        KEY        I4        KEY        I4        KEY        I4
MIRROR     C11       MIRROR     C11       MIRROR     C11       MIRROR     C11
RAND       I4        RAND       I4        RAND       I4        RAND       I4
UNIQRAND   I4        UNIQRAND   I4        CHARS      C53       CHARS      C79

[Remaining rows illegible: enumerated-value fields P5 through P100 and, in the 1000-byte and 2000-byte templates, UP-prefixed fields of type UC255.]

FIELD TYPES
C  - COMPRESSED CHARACTER STRING (MAXIMUM OF 255 CHARACTERS)
UC - UNCOMPRESSED CHARACTER STRING (MAXIMUM OF 255 CHARACTERS)
I4 - FOUR-BYTE INTEGER; THIS FIELD MAY CONTAIN ANY INTEGER VALUE BETWEEN -2,147,483,648 AND +2,147,483,647
Options:
demand - the number of blocks to allocate
on - the disk on which allocation is desired

For example,

CREATE DATABASE NPSTEST with demand = 1000 on "DSK001",
demand = 2000 on "DSKSYS"

would set aside 1000 blocks on the disk named "DSK001" and 2000 blocks on the disk named "DSKSYS" for the database "NPSTEST".
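The rounding of a block demand up to whole cylinders, as described for CREATE DATABASE, can be sketched as follows. The number of blocks per cylinder on the RDM disks is not given in the text, so blocks_per_cylinder is a hypothetical parameter and the value 150 used in the demonstration is purely illustrative.

```python
def allocated_blocks(demand: int, blocks_per_cylinder: int) -> int:
    """Round a block demand up to a whole number of cylinders and
    return the number of blocks actually allocated."""
    cylinders = (demand + blocks_per_cylinder - 1) // blocks_per_cylinder
    return cylinders * blocks_per_cylinder


if __name__ == "__main__":
    # Hypothetical geometry: 150 blocks per cylinder (not from the source).
    for demand in (1000, 2000):
        print(demand, allocated_blocks(demand, blocks_per_cylinder=150))
```

Integer arithmetic is used for the ceiling so that large demands are not affected by floating-point rounding.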


Once the database has been assigned disk space, relations in that database may be created as follows:

CREATE relation-name ((field name) = (format),...,
(field name) = (format))

The above command would set up an empty relation in the database to which tuples could then be appended. A database is opened by simply entering: "OPEN (database name)".

In order to bulkload records into relations in specified databases, utility programs have been provided. The experimental relations that have been generated on the IBM 3033 system and subsequently loaded into the UNIVAC system have been transferred into the backend machine using these utility programs.

We have attempted to manipulate the placement of relations in a database. That is, once a database has been allocated disk space by the CREATE command, we have tried to force a specific placement of a relation on a particular disk. We have assumed that for a join, optimization can be achieved if the relations to be joined are physically located on different disks. However, our attempts at placement have proven futile. The designers of the backend machine utilized certain placement algorithms. These algorithms are proprietary and are, therefore, unavailable for our modification.

The query language for the machine allows the creation of indices for quicker data access. The creation of these indices and their use are described in the following section.
3. Indices

Simply stated, indices are designed to provide more direct access to stored data. The query language for the relational database machine allows for the creation of two different types of indices. A "clustered" index is one for which the tuples are physically stored in the order of the value in the specified field. A "nonclustered" index is one that is created for a field or group of fields on which the tuples are not clustered.
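The difference between the two index types can be illustrated with in-memory lists. This is only a conceptual sketch and says nothing about how the RDM stores its indices on disk; all function names are hypothetical.

```python
from bisect import bisect_left


def build_clustered(rows, field):
    """Clustered: the tuples themselves are kept in order of the field."""
    return sorted(rows, key=lambda r: r[field])


def build_nonclustered(rows, field):
    """Nonclustered: tuples stay in place; a separate sorted list of
    (value, position) pairs points back at them."""
    return sorted((r[field], pos) for pos, r in enumerate(rows))


def lookup_nonclustered(rows, index, value):
    """Return every tuple whose indexed field equals value, found by a
    binary search on the index rather than a scan of the tuples."""
    i = bisect_left(index, (value, -1))  # positions are >= 0, so this is the first match
    hits = []
    while i < len(index) and index[i][0] == value:
        hits.append(rows[index[i][1]])
        i += 1
    return hits


if __name__ == "__main__":
    rows = [{"key": 3, "name": "c"}, {"key": 1, "name": "a"}, {"key": 2, "name": "b"}]
    print(build_clustered(rows, "key"))
    index = build_nonclustered(rows, "key")
    print(lookup_nonclustered(rows, index, 2))
```

In the clustered case a range of key values occupies adjacent storage, whereas in the nonclustered case each index entry may point to a different place in the relation.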


Note that in NPSTEST all of the relations have been created with clustered indices. Also, as described below, indices for certain relations in other experimental databases may be created, destroyed, and then recreated during the course of the run stream for a particular join.

Table II describes the experimental databases. As explained more fully below in individual experiment descriptions, the size of the databases, the number of relations in the databases, and the indices employed are all factors in the measurements obtained.

TABLE II
The Experimental Databases

E. THE QUERY LANGUAGE FOR THE DATABASE MACHINE

Incorporated as an integral part of the relational database system is the query language (RQL in the case of the RDM 1100). This query language is designed to be both a definition language and a manipulation language for the data stored in the machine.


1. Semantics and Syntax

The use of the CREATE command for both databases and relations has previously been discussed. The following discussion seeks to describe those features of the query language that are essential to an understanding of the nature of the experiments that have been conducted on the join operation.

a. BEGIN (transaction name)

This command is used whenever multiple RQL commands are to be treated as a single command.

b. END (transaction name)

This command is used at the end of the group of RQL commands under BEGIN.

c. CREATE VIEW (view name)

This command is used to set up a virtual relation within a database.

d. DEFINE (stored command name)

This command is used to define a stored command for a particular database. The command so defined can be referenced simply by its name.

e. DESTROY (object name)

This command is used to eliminate databases, relations, indices, views, stored commands, or other constructs from the system.

f. RANGE of (range variable) is (relation name)

Range variables are used to allow the user to establish a synonym for a relation name. Once this synonym is established, it can be used in lieu of the relation name.

g. RETRIEVE (target list) WHERE (qualification)

This is the most essential command for performing join operations. Relations or portions of relations are pulled from storage and displayed for the user. The data retrieved depends on the user-supplied qualification, which may include singular or multiple equalities and inequalities. Specific fields can be specified in the target list.

h. RETRIEVE (variable name = GETTIME ())

GETTIME is a function in RQL that allows the user to retrieve a time statement from the RDM clock. The time integer retrieved is in 1/60 seconds. As will be described below, we used these times for our computations.
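Since the clock value is an integer count of 1/60-second units, an elapsed time is obtained from two readings taken before and after a query. The helper below is a minimal sketch of that arithmetic; the function name is hypothetical, and it assumes the clock does not wrap around between the two readings.

```python
TICKS_PER_SECOND = 60  # the clock integer is stated to be in 1/60 seconds


def elapsed_seconds(start_ticks: int, end_ticks: int) -> float:
    """Elapsed time in seconds between two clock readings.

    Assumes no clock wraparound between the readings."""
    return (end_ticks - start_ticks) / TICKS_PER_SECOND


if __name__ == "__main__":
    # Illustrative readings only, not measurements from the experiments.
    print(elapsed_seconds(1200, 1500))
```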


2. The Experimental Queries

Queries have been designed utilizing those features of RQL described above. The query streams have been designed as sets of transactions, and the joins have been designed as stored commands so that the commands could be pre-parsed and parsing time would thereby be eliminated from the join time measurements. The number of fields targeted for a join is described below in individual experiment descriptions.
III. THE BENCHMARKING

A. GOALS

The experiments described in this paper are directed towards the development of a procedure by which database machines may be benchmarked. The efforts described here represent only a portion of the research. Interested readers are directed to [Ref. 1], [Ref. 2], and [Ref. 3] for additional research on selection and projection, database administration, and database generation.

The goal of these experiments is not to make a definitive pronouncement on the performance of the various configurations of the RDM 1100. Rather, the goal is to learn how to design benchmarks and interpret the results of the benchmarking experiments. Towards this end, the methodology must be machine independent, and the workload model must be based on a mix of database management statements.


B. THE METHODOLOGY

The workload has been modeled as a collection of queries in the relational query language (RQL). The primary benchmark kernel for the join operations is the RETRIEVE statement with associated qualifications. In designing this workload, classes of queries have been identified. These include data-intensive and overhead-intensive classes. The workload has been constructed as a combination of queries from each class. The query language has functioned as the primary tool for performance measurement since neither software nor hardware probes have been available for use in conducting these experiments. Using the functions provided in the query language, elapsed times are measurable from the database machine clock in seconds. Since the goal of these experiments is to learn the effects of varying parameters on machine performance and not absolute machine performance, this "rough" measurement technique is acceptable.

The operating system of the host machine allows the use of pre-defined commands and queries known as scripts, which has eliminated the fluctuation of terminal time. Additionally, the fluctuation of the parse time has been eliminated by using pre-parsed commands stored in the database. However, some fluctuation is introduced by the query post-processor which formats data for screen display, but this is not significant within the query sets.

The initial approach to defining relevant queries has been to concentrate on the repetition of certain operations. During this repetition, given factors have been varied to ascertain effects on performance. For the join operations, tuple sizes, database sizes, index structure, disk placement, and machine configuration have been varied.

By and large, the query streams have been run in a controlled environment. To offset the workload variability of the host machine, runs have been conducted during times of minimal activity on the host. Likewise, use of the database machine has been restricted to a single user.
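The idea of repeating the same operation while varying one factor at a time can be made concrete by enumerating the combinations of factor values. The sketch below uses the tuple sizes, relation-size pairs, and index structures named in this thesis, but it is only an illustration: the text does not say that the full cross product was run, and the names are hypothetical.

```python
from itertools import product

# Factor values taken from the text; the cross product itself is an assumption.
TUPLE_SIZES = (100, 200, 1000, 2000)                      # bytes
RELATION_PAIRS = ((500, 1000), (2500, 5000), (5000, 10000))  # tuples per relation
INDEX_STRUCTURES = ("clustered", "nonclustered", "none")


def experiment_matrix():
    """List every combination of the varied factors as one planned run."""
    return [
        {"tuple_size": size, "relations": pair, "index": index}
        for size, pair, index in product(TUPLE_SIZES, RELATION_PAIRS, INDEX_STRUCTURES)
    ]


if __name__ == "__main__":
    runs = experiment_matrix()
    print(len(runs))
    print(runs[0])
```

Holding all but one entry of a run fixed and comparing the timings is what allows the effect of a single factor to be isolated.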


C. THE JOIN OPERATIONS

Several groups of experiments have been conducted during which certain parameters have been varied during repetitions of the same experiment. These experiments have been designed to obtain measurements on one particular aspect of relational database management: the join operations.

1. The Definition of a Join

Simply stated, a join is a composition of two or more relations. In relational algebra, the join can be expressed as follows: the θ-join of column x of table R and column y of table S is a table whose rows are in the Cartesian product of R and S, such that, for the mathematical operator θ, the row element x of R and the row element y of S hold true for θ.
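The definition above translates almost directly into code. The following sketch forms the Cartesian product of two small relations and keeps the pairs that satisfy θ. It illustrates the definition only and is not meant to suggest how the RDM evaluates joins; theta_join and the sample rows are hypothetical.

```python
import operator


def theta_join(r, s, x, y, theta):
    """Pairs from the Cartesian product of r and s for which
    theta(row_of_r[x], row_of_s[y]) holds."""
    return [(a, b) for a in r for b in s if theta(a[x], b[y])]


if __name__ == "__main__":
    R = [{"x": 1}, {"x": 2}, {"x": 3}]
    S = [{"y": 2}, {"y": 3}]
    print(theta_join(R, S, "x", "y", operator.eq))  # theta is "="
    print(theta_join(R, S, "x", "y", operator.lt))  # theta is "<"
```

Passing operator.eq gives an equality join, while operator.lt, operator.gt, and similar comparisons give inequality joins.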


2. The Join in the Benchmarked Query Language

In the benchmarked query language, RQL, a join is accomplished using the RETRIEVE, RANGE, and qualifier WHERE commands. For example:

RELATION: PERSONNEL              RELATION: DEPARTMENT

LASTNAME  AGE  DEPT              NAME   PHONE  OFFICE
BROWN     25   OPS               OPS    251    141
WHITE     2?   OPS               ADMIN  205    142
SMITH     32   ADMIN             ENG    29?    144

Given the above relations, a typical join query in RQL could be:

RANGE of P is Personnel
RANGE of D is Department
RETRIEVE (P.lastname, D.phone, D.office)
WHERE P.dept = D.name

This query would return:

BROWN  251  141
WHITE  251  141
SMITH  205  142





D. EQUALITY JOINS

An equality join is one in which θ is defined as the mathematical equality (i.e., =). That is, the statement following the qualifier WHERE in RQL contains either a singular equality or multiple equalities. For example, for the relations described above, the following retrievals represent two different equality joins:

RETRIEVE (P.lastname, D.phone)
WHERE P.dept = D.name and P.age = "25"

or

RETRIEVE (P.lastname, D.phone)
WHERE P.dept = D.name and D.name = "OPS"
and P.age = "25"
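One common way to evaluate an equality join is a hash join: one relation is hashed on the join field and the other is probed against it. The text does not say which algorithm the RDM uses, so the sketch below is an assumption offered for illustration. The rows, the phone values, and the function name are made up, and the extra predicate stands in for the additional equalities in the WHERE clause.

```python
from collections import defaultdict

# Illustrative rows only; values are invented for the example.
personnel = [
    {"lastname": "BROWN", "age": "25", "dept": "OPS"},
    {"lastname": "WHITE", "age": "30", "dept": "OPS"},
    {"lastname": "SMITH", "age": "25", "dept": "ADMIN"},
]
department = [
    {"name": "OPS", "phone": "555-0101"},
    {"name": "ADMIN", "phone": "555-0102"},
]


def equality_join(p_rows, d_rows, extra=lambda p, d: True):
    """Join on p.dept = d.name with a hash table on d.name, then apply any
    additional qualification to each matching pair."""
    by_name = defaultdict(list)
    for d in d_rows:
        by_name[d["name"]].append(d)
    return [
        (p["lastname"], d["phone"])
        for p in p_rows
        for d in by_name.get(p["dept"], [])
        if extra(p, d)
    ]


if __name__ == "__main__":
    # WHERE P.dept = D.name and P.age = "25"
    print(equality_join(personnel, department, lambda p, d: p["age"] == "25"))
    # WHERE P.dept = D.name and D.name = "OPS" and P.age = "25"
    print(equality_join(personnel, department,
                        lambda p, d: d["name"] == "OPS" and p["age"] == "25"))
```

With a hash table the cost grows with the sum of the relation sizes rather than their product, which is one possible reason for join times that grow linearly with the number of tuples.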


2. The Databases Used

Equality joins represent the vast majority of experiments conducted during this research. Equality joins have been conducted on all of the databases listed in Table II.

3. Queries Used

Equality joins have been run with both singular and multiple qualifications (i.e., singular or multiple equalities in the WHERE clause). The majority of the joins have been conducted on singular qualifications, and the discussions below focus primarily on those experiments. The multiple-qualification joins will be discussed separately. The singularly-qualified joins are equated on the KEY field of each relation.
4. Results

As previously explained, the methodology emphasizes varying parameters throughout the repetition of the same group of experiments. The queries and their results are presented below, and they are grouped by the parameter that has been varied.

a. Variability of Relation Size and Tuple Size

Figure 3.1 depicts three joins of relations whose tuple size is 100 bytes. The first equality join involves a relation of 500 tuples and another relation of 1000 tuples. The second equality join involves a relation of 2500 tuples and another of 5000 tuples. The third equality join involves a relation of 5000 tuples and another of 10000 tuples. It is clearly evident that the join times increase linearly as the number of tuples being joined increases linearly.

We now vary the tuple size for all three relations. Thus, we benchmark the three relations whose tuple size is 200 bytes. This is depicted in Figure 3.2. The benchmark of the relations whose tuple size is 1000 bytes is depicted in Figure 3.3, and the benchmark of the relations whose tuple size is 2000 bytes is depicted in Figure 3.4. The linearity demonstrated earlier is again evident in these joins.

Figure 3.5 is a compilation of Figures 3.1, 3.2, 3.3, and 3.4 in which the slopes (or the rates) of linearity may be compared. Most importantly, the bigger the tuple size, the steeper the slope will be.
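The slope for each tuple size can be estimated from the measured points by an ordinary least-squares fit. The sketch below shows the calculation; the sample points are placeholders chosen to lie exactly on a line and are not measurements from these experiments. The function name is hypothetical.

```python
def least_squares_slope(points):
    """Slope of the best-fit straight line through (x, y) points,
    where x is the number of tuples joined and y is the join time."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in points)
    denominator = sum((x - mean_x) ** 2 for x, _ in points)
    return numerator / denominator


if __name__ == "__main__":
    # Placeholder values only: (tuples joined, seconds).
    sample = [(1500, 15.0), (7500, 75.0), (15000, 150.0)]
    print(least_squares_slope(sample))
```

Computing one slope per tuple size and comparing them gives a numerical counterpart to the visual comparison in the compilation figure.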
[Plot: Benchmarked Relations with Large, Fixed Tuple Size of 1000 Bytes; horizontal axis: tuples joined]

Figure 3.3 Benchmarked Relations - Large Tuple Size.
Figure 3.4 Benchmarked Relations - Very Large Tuple Size.
b. Variability of Database Size in Terms of Number of Blocks

Figure 3.6 depicts three join operations over a large database of 31350 blocks, NPSTEST. Additionally represented in Figure 3.6 are the same three joins on three small databases, NPS4 of 150 blocks, NPS5 of [...] blocks, and NPS6 of 1500 blocks. The results of these joins reveal that the database size in blocks is not a significant factor in join time.
c. Variability of Database Disk Placement

Each of the databases NPS1, NPS2, NPS3, NPS11, NPS12, and NPS13 contains only two relations. NPS1, NPS2, and NPS3 have been created on the same disk. NPS11, NPS12, and NPS13 have been created separately, each of which occupies two disks. Figure 3.7 depicts the joins on NPS1, NPS2, and NPS3 versus the same joins conducted on NPS11, NPS12, and NPS13. The results strongly suggest that database disk placement, especially for relatively small databases, is not a major factor in join time.
d. Variability of Index Structure

A query stream has been run on NPS11, NPS12, and NPS13. During the run the index structure on the relations in the databases has been modified from clustered to nonclustered and then eliminated. Figure 3.8 depicts the join times in each situation. From the results, it can be reasonably assumed that for the relations involved there is no significant difference between clustered and nonclustered indices. However, the join times for those relations with no indices have increased exponentially.

Figure 3.6 The Impact of Database Block Size on Joins.
Figure 3.7 The Impact of Disk Placements on Joins.
Figure 3.8 The Impact of Indices on Joins.
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Ee Vicseap atv On. Machine Configuration 


As stated earlier, during the course of these
experiments, the database machine being benchmarked has
operated under three different configurations: 1/2-megabyte
cache memory without a database accelerator, 2-megabyte
cache memory without a database accelerator, and 2-megabyte
cache memory with a database accelerator. Joins over
relations with the fixed tuple size of 100 bytes in the
database, NPSTEST, have been conducted on all three
configurations. The comparative results of these joins are
depicted in Figure 3.9. These results show that an increase
in cache memory size from 1/2 to 2 megabytes improved join
time by 27% to 31%. The addition of the database
accelerator to the 2-megabyte cache improved the join
time by only 6% to 12%. These results would
conclusively indicate that, for the join operation, a
larger cache memory is much more effective than the addition
of a database accelerator.
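
The percentage improvements quoted above can be reproduced from raw join times with a short calculation. The sketch below is illustrative only: the function name `percent_decrease` and the sample timings are hypothetical and are not measurements from these experiments.

```python
def percent_decrease(before: float, after: float) -> float:
    """Return the percentage by which `after` is smaller than `before`."""
    if before <= 0:
        raise ValueError("baseline join time must be positive")
    return 100.0 * (before - after) / before


# Hypothetical join times in minutes (NOT measured values).
half_mb_cache = 1.00
two_mb_cache = 0.70
two_mb_cache_with_accelerator = 0.63

# Each configuration is compared against the one it replaces.
cache_gain = percent_decrease(half_mb_cache, two_mb_cache)
accelerator_gain = percent_decrease(two_mb_cache, two_mb_cache_with_accelerator)
print(f"cache: {cache_gain:.1f}%  accelerator: {accelerator_gain:.1f}%")
```

Note that the accelerator gain is measured relative to the 2-megabyte configuration, not the original 1/2-megabyte baseline, which matches the way the two improvements are reported in the text.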


Figure 3.9 The Impact of Machine Configurations on Joins.

5. Selection Experiments

In addition to the equality joins described so far,
there has been an additional qualification designed to
select only a certain portion of the joined tuples for
display. The number of tuples to be displayed is to be 5%
of the number of tuples in the smaller relation of the
two relations in each join. To accomplish this objective for
the join of the 500-tuple relation and the 1000-tuple
relation, the additional qualification is to impose a "< 25"
restriction on the KEY attribute. That is, the relations
have been joined on the equality of the KEY field in each
relation, and there has been the additional qualifier that
those tuples to be displayed must have a KEY value that is
less than "25". For the join of the 2500-tuple relation and
the 5000-tuple relation the restriction is "< 125", and for
the join of the 5000-tuple relation and the 10000-tuple
relation the restriction is "< 250".
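
The restriction values follow from taking 5% of the tuple count of the smaller relation in each join. The sketch below uses Python's built-in sqlite3 module as a stand-in for the database machine and RQL, builds two small relations, and runs the equality join with the added KEY restriction. The table and column names, the helper functions, and the assumption that KEY values run from 0 to n-1 are illustrative and do not come from the benchmark databases.

```python
import sqlite3


def selection_threshold(smaller_size: int, fraction: float = 0.05) -> int:
    """KEY bound that keeps `fraction` of the smaller relation (keys 0..n-1)."""
    return round(smaller_size * fraction)


def build_relation(conn: sqlite3.Connection, name: str, size: int) -> None:
    """Create a relation whose unique integer key column holds 0..size-1."""
    conn.execute(f"CREATE TABLE {name} (key_attr INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        f"INSERT INTO {name} VALUES (?, ?)",
        ((k, f"tuple-{k}") for k in range(size)),
    )


conn = sqlite3.connect(":memory:")
build_relation(conn, "r500", 500)
build_relation(conn, "r1000", 1000)

bound = selection_threshold(500)
# Equality join on the key, plus the selection restriction on the key.
rows = conn.execute(
    "SELECT a.key_attr, a.payload, b.payload "
    "FROM r500 AS a JOIN r1000 AS b ON a.key_attr = b.key_attr "
    "WHERE a.key_attr < ?",
    (bound,),
).fetchall()
print(len(rows), "joined tuples selected for display")
```

Under the stated key assumption, the number of rows returned equals the bound itself, so the same helper yields the restriction for each pair of relations.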


Figure 3.10 depicts the response times for these
join selections. Figure 3.11 depicts the response times for
these same joins for which there is no 5%-selection
restriction. A comparison of the results of each join reveals
that, especially for the join of the larger relations, the
difference in response time is proportionally greater. These
significant differences are likely due to at least two
prevalent factors. First of all, there is an I/O overhead
that undoubtedly comprises a major portion of the
difference. Secondly, it is highly probable that for this type
of join the select operation is performed first, and then the
actual join is performed. A comparison of Figures 3.10 and
3.11 would support this hypothesis.

Figure 3.12 depicts a comparison between two sets of
three joins on the same relations with nonclustered indices.
The first set requires no relations to be sorted. The
second set requires the relations to be sorted on an
attribute other than the KEY attribute on which the index is
based. The comparative results of the runs for these joins
are close. The plotted curves for the response times cross
each other. This may indicate that sorting on the
basis of a non-key attribute does not improve
join time.

Figure 3.13 depicts a comparison between two sets of
the same three joins for which the expression of the
equality predicate has been reversed. For these particular
joins the reversal of the expression of the equality
predicate appears to be insignificant as a factor in join time.

Figure 3.10 Three 5%-Join Selections.

E. INEQUALITY JOINS


A limited number of inequality joins has been conducted
during the course of these experiments.

1. Definition

For these experiments an inequality join is one in
which θ is defined as a mathematical inequality. That is,
the statement following the qualification WHERE in RQL
contains either "!=", "<", or ">". This qualification has
been imposed on the KEY attribute.

2. Experiments 

Inequalities have been applied to the join of a
500-tuple relation and a 1000-tuple relation and to the join
of a 2500-tuple relation and a 5000-tuple relation.

3. Disastrous Results 

The results of these joins have proven to be
disastrous. For even the smaller join of the 500-tuple relation
and the 1000-tuple relation, the response time has run into
hours. This long response time has jeopardized the
integrity of the experiments, since during the course of the
run the status of the host machine has experienced
significant fluctuations in load conditions. Obviously, it may
prove the point that the inequality joins cannot be
supported by the machine with any reasonable response time.
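
One likely contributor to such response times is the size of an inequality join result, which grows with the product of the relation sizes rather than with the size of the smaller relation. The sketch below counts result tuples for each operator. It assumes unique integer KEY values 0..n-1 in each relation; that assumption and the function name are hypothetical and serve only to illustrate the scale involved.

```python
def join_result_sizes(n_small: int, n_large: int) -> dict:
    """Count result tuples when joining keys 0..n_small-1 with 0..n_large-1.

    Assumes unique integer keys and n_small <= n_large.
    """
    total_pairs = n_small * n_large
    equal = n_small  # each key of the smaller relation has exactly one match
    # For a key a, the larger relation holds n_large - 1 - a keys above it.
    less = sum(n_large - 1 - a for a in range(n_small))
    # For a key a, the larger relation holds a keys below it (0..a-1).
    greater = sum(a for a in range(n_small))
    return {
        "=": equal,
        "!=": total_pairs - equal,
        "<": less,
        ">": greater,
    }


sizes = join_result_sizes(500, 1000)
for op, count in sizes.items():
    print(f"A.KEY {op} B.KEY -> {count} tuples")
```

Under these assumptions the equality join of the 500-tuple and 1000-tuple relations yields 500 tuples, while each inequality form yields more than one hundred thousand.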
F. THE THREE-WAY JOIN 

1. Definition and Example

For these experiments a three-way join is simply a
composition of three relations via equality joins. The
three relations have been joined on the equality of the KEY
attribute of each relation. That is, for relations A, B,
and C, the join has been accomplished WHERE A.KEY = B.KEY,
with C joined on its KEY attribute in the same manner.
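
As a concrete illustration of this definition, the sketch below composes three relations through two equality predicates on the key attribute, using Python's sqlite3 module as a stand-in for the database machine. The second predicate (B to C), the table and column names, and the key ranges 0..n-1 are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Relation sizes follow the experiment: 500, 1000, and 2500 tuples.
for name, size in (("a", 500), ("b", 1000), ("c", 2500)):
    conn.execute(f"CREATE TABLE {name} (key_attr INTEGER PRIMARY KEY, val TEXT)")
    conn.executemany(
        f"INSERT INTO {name} VALUES (?, ?)",
        ((k, f"{name}{k}") for k in range(size)),
    )

# Two equality predicates chain the three relations together.
three_way = conn.execute(
    "SELECT a.key_attr, a.val, b.val, c.val "
    "FROM a, b, c "
    "WHERE a.key_attr = b.key_attr AND b.key_attr = c.key_attr"
).fetchall()
print(len(three_way), "tuples in the three-way join")
```

With unique keys, the result can contain no more tuples than the smallest of the three relations.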


2. Experiments

The three relations that have been joined are a
500-tuple relation, a 1000-tuple relation, and a 2500-tuple
relation. No selection restriction has been imposed on the
join.

The response time for this query is 0.8114 minutes.
A two-way join, under similar conditions, of the same
500-tuple and 1000-tuple relations has been accomplished in
0.7011 minutes. The small increase of 0.1103 minutes in
response time from the two-way join to the three-way join
would appear to further demonstrate the significance
of the one-time I/O overhead in joins. In other words,
regardless of the number of ways a join is to be conducted,
the one-time I/O overhead would consume a substantial
portion of the join time. In this case, the overhead
consumes about 65% of the three-way join time.
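
The arithmetic behind this comparison is short enough to check directly. The sketch below uses only the two response times reported above; the variable names are illustrative.

```python
three_way_minutes = 0.8114  # reported three-way join response time
two_way_minutes = 0.7011    # reported two-way join response time

# Absolute cost of adding the third relation to the join.
increase = three_way_minutes - two_way_minutes

# The same cost expressed relative to the two-way join.
relative_increase = 100.0 * increase / two_way_minutes

print(f"increase: {increase:.4f} minutes")
print(f"relative increase over the two-way join: {relative_increase:.1f}%")
```

Adding the third relation therefore lengthens the response time by about 16% of the two-way figure, which is what makes the fixed overhead stand out.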


G. JOINS VERSUS VIEWS


1. The View in the Benchmarked Query Language


The CREATE VIEW command is used to set up a
virtual relation which is composed of attributes of one or
more relations. The VIEW is not physically a relation.
Instead, its definition is stored in the database. The
following example creates a new virtual relation, LOCATOR:
RANGE of P is Personnel
RANGE of D is Department
CREATE VIEW LOCATOR (P.name, D.name, D.office, D.phone)
WHERE P.dept = D.name
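
For readers more familiar with SQL, the sketch below expresses a comparable view using Python's sqlite3 module. It is an analogue rather than RQL: the sample rows, the column aliases, and the SQL syntax are assumptions for illustration. It also checks that querying the view returns the same tuples as writing the join out directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE personnel (name TEXT, dept TEXT);
    CREATE TABLE department (name TEXT, office TEXT, phone TEXT);
    INSERT INTO personnel VALUES ('Smith', 'Supply'), ('Jones', 'Admin');
    INSERT INTO department VALUES ('Supply', 'B-12', '555-0100'),
                                  ('Admin', 'A-03', '555-0101');

    -- Only this definition is stored; no joined tuples are materialized.
    CREATE VIEW locator AS
        SELECT p.name AS person, d.name AS dept, d.office, d.phone
        FROM personnel AS p JOIN department AS d ON p.dept = d.name;
    """
)

# Query the stored view definition.
via_view = conn.execute("SELECT * FROM locator ORDER BY person").fetchall()

# Write the same join out explicitly for comparison.
via_join = conn.execute(
    "SELECT p.name, d.name, d.office, d.phone "
    "FROM personnel AS p JOIN department AS d ON p.dept = d.name "
    "ORDER BY p.name"
).fetchall()
print(via_view == via_join)
```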
2. Experiments on Views


The views have been defined and stored in the
appropriate databases before their use for comparison to join
operations. For both the views and the joins, projection
has been limited to five attributes, but no restriction has
been imposed on selection.

The views have been created from a 500-tuple
relation and a 1000-tuple relation; a 2500-tuple relation and a
5000-tuple relation; and a 5000-tuple relation and a
10000-tuple relation. These relations exist in the databases
NPS11, NPS12, and NPS13, respectively. Likewise, the joins
have been accomplished on these same relations and
databases.

Figure 3.14 depicts the comparative response times
for each of the three situations. The remarkable similarity
in response times between views and joins for these
experiments would seem to point out that the views are no more
expensive or inefficient to use than the joins. In certain
situations, however, the views could be of greater value,
since they require very little disk space as compared to the
physical space needed by the relations produced by the joins.
Additionally, the view appears to provide the
ability for controlling access to the database.
A. GENERAL COMMENTS


The experiments discussed above have revealed several
interesting results, primarily the consistent linearity in
join times and the apparent significant join overhead
undoubtedly resulting from bus contention. Figure 4.1
illustrates both of these characteristics.

More specifically, Figure 4.1 depicts the total join
time for various numbers of blocks of joined data. The
inherent overhead is clearly evident for access to fewer than
1000 blocks, while those joins involving 1000 or more
blocks clearly demonstrate the consistent linearity
previously discussed.

As also previously discussed, the GETTIME function in
RQL has been the only measurement tool employed. Although
no hardware or software probes have been available, the
experiments that have been run using GETTIME have provided
enough information so that some statement concerning the
mean attainable block access time can be made. Figure
4.2 depicts the average block access time for each tuple
template and the effects on this average as the join has
been repeated over increasingly larger relations (in the
number of tuples).


From Figure 4.2 it is evident that the overhead of the
initial access is being absorbed as the size of the
relations being joined increases. By repeating the same join
for increases in both block size and number of tuples
accessed, some representative mean access times can be
ascertained. That is, the access time curves will approach
some asymptotic lower bound.
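
The asymptotic behaviour described here is what a simple cost model predicts when a fixed start-up overhead is spread over the blocks accessed. The sketch below assumes that total join time equals a fixed overhead plus the number of blocks times a per-block time. The model, the function name, and the numeric constants are hypothetical and are not fitted to the measurements.

```python
def mean_access_time(blocks: int, overhead: float, per_block: float) -> float:
    """Mean time per block when a fixed overhead is amortized over `blocks`."""
    if blocks <= 0:
        raise ValueError("blocks must be positive")
    total_time = overhead + blocks * per_block  # linear join-time model
    return total_time / blocks


# Hypothetical constants in seconds; NOT taken from the experiments.
OVERHEAD = 30.0
PER_BLOCK = 0.02

block_counts = (100, 1000, 10000, 100000)
means = [mean_access_time(n, OVERHEAD, PER_BLOCK) for n in block_counts]
for n, m in zip(block_counts, means):
    print(f"{n:>6} blocks: {m:.4f} s per block")
```

In this model the mean access time falls toward the per-block time as the block count grows, which is the lower bound that the measured curves would be expected to approach.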
Figure 4.1 Block Access Times.

Figure 4.2 Mean Access Times.
Figure 4.2 also reveals that for this particular
machine it is more efficient (or profitable) to
perform joins on larger relations. The access times for
smaller relations are much higher than the access times for
the larger relations. As the size of the relations
increases, the mean access time demonstrates a convergence
to a representative number. This number, the mean access
time, can be considered an important characteristic of this
particular benchmarking experiment.
B. A COMPARISON OF DIFFERENT ACCELERATOR/CACHE 
CONFIGURATIONS 


This benchmarking experiment has not been designed as an
analysis of several differently configured RDM 1100s.
However, while this benchmarking has been in progress, the
availability of more cache and the database accelerator has
stimulated much interest in the performance differences for
the different machine configurations. Therefore,
considerable time has been expended towards accumulating comparable
data for each of the three configurations on which
experiments have been run.

In Chapter III there is a brief discussion of the
differences in join times for the relations of 100-byte
tuples. The following discussion focuses on the 24 joins
conducted on the database, NPSTEST, for each of the three
configurations.


Table III summarizes the average percentage decrease in
join time for each join as the amount of cache is increased
from 1/2 megabyte to 2 megabytes. Table III also summarizes
the further decrease in join time as the database
accelerator is added to the 2-megabyte cache configuration. This
summary reveals larger decreases in the join time as the
sizes of the relations being joined increase. In other
words, as the initial join overhead is absorbed, the
additional cache increasingly decreases the join time, by a
percentage of approximately 59%. Correspondingly, it
appears that the effects of adding a database accelerator to
the 2-megabyte cache are less significant for the larger
relations, although in all cases there is some improvement.

Table III. Comparison of Joins Conducted on Different Machine
Configurations


C. THE METHODOLOGY AND ITS LIMITATIONS 
The methodology and the experimental approach have been
fundamentally sound. The approach of varying join parameters
has provided, and should continue to provide, relevant
information from which insight can be drawn. However, as
discussed above, benchmarking is a relatively new area of
research in computer science, and certainly the techniques
that have been applied throughout the course of these
experiments can be improved and refined.
A definitive performance pronouncement on the RDM 1100
has not been the ultimate goal, owing in part to the use of
the GETTIME function of RQL. Despite its limitations in
obtaining performance measurements, the GETTIME function has
been deemed accurate enough for the purposes of our experiments.


The function has been considered sufficiently
accurate in view of the lack of other more accurate
measurement tools. Hardware probes have not been available, and
software packages for performance data collection have been
delayed and are unavailable for these experiments. Future
attempts to benchmark such a system should utilize additional
methods for determining relevant performance data.

The benchmarking of the RDM 1100 is a project of
seemingly low priority at the command which houses the host
UNIVAC system. Existing workloads demand vast amounts of the
system's resources, and in reality it has been quite
difficult to control the environment in which these experiments
have been conducted. Thus the load conditions of the host
system may have compromised the integrity of some results.
Additionally, the majority of the experiments have been
conducted from a remote terminal, which has probably further
degraded the experimental results. Obviously,
properly controlled experimentation is the ideal practice
for benchmarking experiments.

Our inability to control the host environment raises yet
another issue. The goal of the experiments has been to
collect measurements on joins for which certain parameters
are varied. However, a major parameter has not been varied.
That parameter is the load condition of the host. As
described above, attempts have been made to run experiments
at times of minimal host activity. In actual practice, the
database machine is likely to be benchmarked during periods
of peak host activity. Future benchmarking efforts should
take this into consideration, and attempts should be made to
control and vary host load conditions as part of the mix of
query scripts. In view of the minimal host activity, the
results we have obtained may be considered as the optimal
performance of the RDM 1100 for join operations.

As the deadline for submission of this thesis has drawn
near, planned experiments have been cancelled from the
testing agenda. A "time crunch" has resulted from a variety
of sources. Primary among these sources has been the
continuing requirement to correct software deficiencies that have
been identified as a result of the experiments that have
been conducted. Likewise, the changing of the database
machine configuration has also severely cut into the time
available to run the full set of planned experiments. In
essence, although a great deal of relevant data has been
collected, the consistency of some data may be questionable
since a limited number of experiments has been conducted in
each area of experimentation.







Despite these limitations and deficiencies, the
experiments that have been conducted have provided enough relevant
information from which valuable conclusions can be drawn.
The results of the join experiments described here, when
combined with those results of selection and projection
experiments, comprise a substantial starting point for the
comparison of similar database machine architectures. They
provide a solid framework for benchmarking relational
database machines.


























